Improved Cross-Corpus Speech Emotion Recognition Using Deep Local Domain Adaptation
Abstract
Due to the scarcity of high-quality labeled speech emotion data, it is natural to apply transfer learning to emotion recognition. However, transfer learning-based recognition becomes more challenging because of the complexity and ambiguity of emotion. Domain adaptation based on maximum mean discrepancy considers the marginal alignment of the source domain and the target domain, but does not take into account the class prior distribution in both domains, which results in a reduction in efficiency. To address this problem, this study proposes a novel cross-corpus framework based on local domain adaptation. A category-grained discrepancy is used to evaluate the distance between two relevant domains. According to the research findings, the generalization ability of the model is enhanced by using the local adaptive method. Compared with global adaptive and non-adaptive methods, the effectiveness is significantly improved.
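To make the contrast in the abstract concrete, the following is a minimal sketch of a global maximum mean discrepancy (MMD) next to a class-wise ("local") variant. It is not the paper's implementation: the Gaussian kernel, the fixed bandwidth `sigma`, the use of soft target predictions `pt` as class weights, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))


def weighted_mmd2(xs, xt, ws, wt, sigma=1.0):
    """Squared MMD between two weighted samples (each weight vector sums to 1)."""
    k_ss = gaussian_kernel(xs, xs, sigma)
    k_tt = gaussian_kernel(xt, xt, sigma)
    k_st = gaussian_kernel(xs, xt, sigma)
    return ws @ k_ss @ ws + wt @ k_tt @ wt - 2.0 * ws @ k_st @ wt


def global_mmd2(xs, xt, sigma=1.0):
    """Marginal alignment: every sample gets the same weight, labels are ignored."""
    ws = np.full(len(xs), 1.0 / len(xs))
    wt = np.full(len(xt), 1.0 / len(xt))
    return weighted_mmd2(xs, xt, ws, wt, sigma)


def local_mmd2(xs, ys, xt, pt, num_classes, sigma=1.0):
    """Class-wise alignment: average of one weighted MMD per emotion class.

    ys: integer source labels, shape (n_s,)
    pt: assumed soft class predictions for the target, shape (n_t, num_classes)
    """
    total, used = 0.0, 0
    for c in range(num_classes):
        ws = (ys == c).astype(float)
        wt = pt[:, c].astype(float)
        if ws.sum() == 0 or wt.sum() == 0:
            continue  # class absent in one domain: skip it
        total += weighted_mmd2(xs, xt, ws / ws.sum(), wt / wt.sum(), sigma)
        used += 1
    return total / max(used, 1)
```

In a deep model, a term like `local_mmd2` would typically be computed on hidden-layer features and added to the classification loss with a trade-off weight. How the paper weights and schedules this term is not stated in the abstract.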
Similar resources
Speech Emotion Recognition Using Scalogram Based Deep Structure
Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...
A cross-corpus experiment in speech emotion recognition
In this work we will introduce EmoSTAR as a new emotional database and perform cross-corpus tests between EmoSTAR and EmoDB (Berlin Emotional Database) using one of the two databases as training set and the other as test set. We will also investigate the performance of feature selectors in both databases. Feature extraction will be implemented with openSMILE toolkit employing Emobase and Emo_la...
An unsupervised deep domain adaptation approach for robust speech recognition
This paper addresses the robust speech recognition problem as a domain adaptation task. Specifically, we introduce an unsupervised deep domain adaptation (DDA) approach to acoustic modeling in order to eliminate the training–testing mismatch that is common in real-world use of speech recognition. Under a multi-task learning framework, the approach jointly learns two discriminative classifiers u...
Efficient Emotion Recognition from Speech Using Deep Learning on Spectrograms
We present a new implementation of emotion recognition from the para-lingual information in the speech, based on a deep neural network, applied directly to spectrograms. This new method achieves higher recognition accuracy compared to previously published results, while also limiting the latency. It processes the speech input in smaller segments – up to 3 seconds, and splits a longer input into...
Speech Emotion Recognition Considering Local Dynamic Features
Recently, increasing attention has been directed to the study of speech emotion recognition, in which global acoustic features of an utterance are mostly used to eliminate the content differences. However, the expression of speech emotion is a dynamic process, which is reflected through dynamic durations, energies, and some other prosodic information when one speaks. In this paper, a novel ...
Journal
Journal title: Chinese Journal of Electronics
Year: 2023
ISSN: 1022-4653, 2075-5597
DOI: https://doi.org/10.23919/cje.2021.00.196